Gözetimsiz Ayırıcı Dil Modeli Eğitimi (Unsupervised Discriminative Language Model Training)

Authors

  • Erinç Dikici
  • Murat Saraçlar
Abstract

Discriminative language model (DLM) training, the final stage of an automatic speech recognition system, aims to select the most accurate of the candidate word sequences that it uses as training examples. In supervised training, a manually transcribed reference text of the spoken utterance is available. In unsupervised training this information is missing, so the accuracy of the examples cannot be known exactly. This work investigates methods for estimating the accuracy of training examples without the reference text, and trains the DLM with variants of the perceptron algorithm adapted for structured prediction and reranking. The results show that unsupervised training can achieve an improvement of up to half of the gain obtained in the supervised case.
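As a rough illustration of the setup the abstract describes, the following is a minimal sketch of a perceptron that reranks N-best lists. It is not the authors' implementation. The n-gram feature set, the function names (`features`, `score`, `train_perceptron`, `rerank`), and the `targets` input are all assumptions made for this example. In supervised training `targets` would come from the reference transcription; in unsupervised training it would come from whatever accuracy estimate is used, which this sketch simply takes as given. The recognizer's own score, which a real system would normally include as a feature, is omitted.

```python
from collections import Counter


def features(hyp):
    """Unigram and bigram counts of a hypothesis (a list of words)."""
    feats = Counter(("uni", w) for w in hyp)
    feats.update(("bi", a, b) for a, b in zip(hyp, hyp[1:]))
    return feats


def score(weights, hyp):
    """Linear model score: dot product of weights and feature counts."""
    return sum(weights.get(f, 0.0) * c for f, c in features(hyp).items())


def train_perceptron(nbest_lists, targets, epochs=3):
    """Train a reranking perceptron.

    nbest_lists[i] is a list of hypotheses for utterance i.
    targets[i] is the index of the hypothesis treated as most accurate
    (assumption: supplied by the caller, from a reference or an estimate).
    """
    weights = {}
    for _ in range(epochs):
        for hyps, t in zip(nbest_lists, targets):
            # Hypothesis the current model ranks first.
            best = max(range(len(hyps)), key=lambda i: score(weights, hyps[i]))
            if best != t:
                # Move weights toward the target and away from the current best.
                for f, c in features(hyps[t]).items():
                    weights[f] = weights.get(f, 0.0) + c
                for f, c in features(hyps[best]).items():
                    weights[f] = weights.get(f, 0.0) - c
    return weights


def rerank(weights, hyps):
    """Return the hypothesis with the highest model score."""
    return max(hyps, key=lambda h: score(weights, h))
```

A toy call would be `rerank(train_perceptron([[["a", "b"], ["a", "c"]]], [1]), [["a", "b"], ["a", "c"]])`, where the second hypothesis has been marked as the target.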

Similar resources

Minimum Imputed-Risk: Unsupervised Discriminative Training for Machine Translation

Discriminative training for machine translation has been well studied in the recent past. A limitation of the work to date is that it relies on the availability of high-quality in-domain bilingual text for supervised training. We present an unsupervised discriminative training framework to incorporate the usually plentiful target-language monolingual data by using a rough “reverse” translation ...

Unsupervised training methods for discriminative language modeling

Discriminative language modeling (DLM) aims to choose the most accurate word sequence by reranking the alternatives output by the automatic speech recognizer (ASR). The conventional (supervised) way of training a DLM requires a large amount of acoustic recordings together with their manual reference transcriptions. These transcriptions are used to determine the target ranks of the ASR outputs, ...

Lightly supervised training for risk-based discriminative language models

We propose a lightly supervised training method for a discriminative language model (DLM) based on risk minimization criteria. In lightly supervised training, pseudo labels generated by automatic speech recognition (ASR) are used as references. However, as these labels usually include recognition errors, the discriminative models estimated from such faulty reference labels may degrade ASR perfo...

Phrasal Cohort Based Unsupervised Discriminative Language Modeling

Simulated confusions enable the use of large text-only corpora for discriminative language modeling by hallucinating the likely recognition outputs that each (correct) sentence would be confused with. In [1], a novel approach was introduced to simulate confusions using phrasal cohorts derived directly from recognition output. However, the described approach relied on transcribed speech to deriv...

Unsupervised Discriminative Training of PLDA for Domain Adaptation in Speaker Verification

This paper presents, for the first time, unsupervised discriminative training of probabilistic linear discriminant analysis (unsupervised DT-PLDA). While discriminative training avoids the problem of generative training based on probabilistic model assumptions that often do not agree with actual data, it has been difficult to apply it to unsupervised scenarios because it can fit data with almos...

Publication date: 2014